
    Analyzing the Performance of Multilayer Neural Networks for Object Recognition

    In the last two years, convolutional neural networks (CNNs) have achieved an impressive suite of results on standard recognition datasets and tasks. CNN-based features seem poised to quickly replace engineered representations, such as SIFT and HOG. However, compared to SIFT and HOG, we understand much less about the nature of the features learned by large CNNs. In this paper, we experimentally probe several aspects of CNN feature learning in an attempt to help practitioners gain useful, evidence-backed intuitions about how to apply CNNs to computer vision problems.
    Comment: Published in European Conference on Computer Vision 2014 (ECCV-2014)
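
    As a point of reference for the abstract's claim, a minimal sketch of using a pretrained CNN as a drop-in replacement for engineered features such as SIFT or HOG is shown below. The model, layer choice, and preprocessing values are illustrative assumptions, not choices taken from the paper.

```python
# Minimal sketch: extract generic image features from a pretrained CNN.
# Model and layer choice are illustrative assumptions.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Load a pretrained CNN and drop its final classification layer,
# keeping the penultimate 4096-d activations as a feature vector.
cnn = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
cnn.classifier = torch.nn.Sequential(*list(cnn.classifier.children())[:-1])
cnn.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def cnn_features(image: Image.Image) -> torch.Tensor:
    """Return a 4096-d feature vector for one image."""
    with torch.no_grad():
        return cnn(preprocess(image).unsqueeze(0)).squeeze(0)
```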

    Contractive De-noising Auto-encoder

    An auto-encoder is a special kind of neural network based on reconstruction. The de-noising auto-encoder (DAE) is an improved auto-encoder that is robust to the input: it corrupts the original data first and then reconstructs the original input by minimizing the reconstruction error. The contractive auto-encoder (CAE) is another improved auto-encoder that learns robust features by penalizing the Frobenius norm of the Jacobian matrix of the learned features with respect to the original input. In this paper, we combine the de-noising and contractive auto-encoders and propose a further improved auto-encoder, the contractive de-noising auto-encoder (CDAE), which is robust to both the original input and the learned features. We stack CDAEs to extract more abstract features and apply an SVM for classification. Experimental results on the benchmark MNIST dataset show that the proposed CDAE outperforms both the DAE and the CAE, demonstrating the effectiveness of our method.
    Comment: Figures edited
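
    A minimal sketch of the training objective the abstract describes, combining a denoising reconstruction loss with the contractive penalty (the Frobenius norm of the encoder Jacobian), follows below. Layer sizes, the noise level, and the penalty weight are illustrative assumptions.

```python
# Minimal CDAE sketch: denoising reconstruction plus contractive penalty.
import torch
import torch.nn as nn

class CDAE(nn.Module):
    def __init__(self, n_in=784, n_hidden=256):
        super().__init__()
        self.enc = nn.Linear(n_in, n_hidden)
        self.dec = nn.Linear(n_hidden, n_in)

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))
        return h, torch.sigmoid(self.dec(h))

def cdae_loss(model, x, noise_std=0.3, lam=0.1):
    x_noisy = x + noise_std * torch.randn_like(x)   # corrupt the input
    h, x_rec = model(x_noisy)
    rec = ((x_rec - x) ** 2).sum(dim=1).mean()      # reconstruct the clean x
    # For a sigmoid encoder h = s(Wx + b), the squared Frobenius norm of
    # dh/dx is sum_j (h_j (1 - h_j))^2 * ||W_j||^2 (the usual CAE penalty).
    w_sq = (model.enc.weight ** 2).sum(dim=1)       # ||W_j||^2 per hidden unit
    contractive = ((h * (1 - h)) ** 2 @ w_sq).mean()
    return rec + lam * contractive
```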

    Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks

    In this paper we propose and investigate a novel nonlinear unit, called the L_p unit, for deep neural networks. The proposed L_p unit receives signals from several projections of a subset of units in the layer below and computes a normalized L_p norm. We notice two interesting interpretations of the L_p unit. First, the proposed unit can be understood as a generalization of a number of conventional pooling operators, such as the average, root-mean-square and max pooling widely used in, for instance, convolutional neural networks (CNN), HMAX models and neocognitrons. Furthermore, the L_p unit is, to a certain degree, similar to the recently proposed maxout unit (Goodfellow et al., 2013), which achieved state-of-the-art object recognition results on a number of benchmark datasets. Secondly, we provide a geometrical interpretation of the activation function, based on which we argue that the L_p unit is more efficient at representing complex, nonlinear separating boundaries. Each L_p unit defines a superelliptic boundary, with its exact shape defined by the order p. We claim that this makes it possible to model arbitrarily shaped, curved boundaries more efficiently by combining a few L_p units of different orders. This insight justifies the need for learning a different order for each unit in the model. We empirically evaluate the proposed L_p units on a number of datasets and show that multilayer perceptrons (MLP) consisting of L_p units achieve state-of-the-art results on a number of benchmark datasets. Furthermore, we evaluate the proposed L_p unit on the recently proposed deep recurrent neural networks (RNN).
    Comment: ECML/PKDD 2014
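
    A minimal sketch of an L_p unit as the abstract describes it, a normalized L_p norm over several learned projections with the order p itself learned, is given below. The group size and the parameterization of p are illustrative assumptions.

```python
# Minimal L_p unit sketch: each unit pools a group of projections with a
# learnable order p, interpolating between average, RMS and max pooling.
import torch
import torch.nn as nn

class LpUnit(nn.Module):
    """One layer of L_p units, each pooling over `group` projections."""
    def __init__(self, n_in, n_units, group=4):
        super().__init__()
        self.proj = nn.Linear(n_in, n_units * group)
        self.group = group
        # Learn log(p - 1) so that p stays > 1 during optimization.
        self.log_p = nn.Parameter(torch.zeros(n_units))

    def forward(self, x):
        z = self.proj(x).view(x.shape[0], -1, self.group)  # (B, units, group)
        p = 1.0 + torch.exp(self.log_p).unsqueeze(0).unsqueeze(-1)
        # Normalized L_p norm: p -> infinity approaches max pooling,
        # p = 2 gives root-mean-square, p = 1 averages |z|.
        return ((z.abs() ** p).mean(dim=-1)) ** (1.0 / p.squeeze(-1))
```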

    3D freeform surfaces from planar sketches using neural networks

    A novel intelligent approach to 3D freeform surface reconstruction from planar sketches is proposed. A multilayer perceptron (MLP) neural network is employed to induce 3D freeform surfaces from planar freehand curves. Planar curves were used to represent the boundaries of a freeform surface patch. The curves were varied iteratively and sampled to produce training data to train and test the neural network. The obtained results demonstrate that the network successfully learned the inverse-projection map and correctly inferred the respective surfaces from fresh curves.
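
    A minimal sketch of the pipeline the abstract outlines, an MLP regressor mapping sampled planar boundary curves to sampled 3D surface points, is shown below. The sample counts, layer sizes, and loss are illustrative assumptions.

```python
# Minimal sketch: MLP from 2D boundary-curve samples to 3D surface samples.
import torch
import torch.nn as nn

N_CURVE = 32        # assumed 2D samples along the sketched boundary curves
N_SURF = 16 * 16    # assumed 3D samples on the reconstructed surface patch

mlp = nn.Sequential(
    nn.Linear(N_CURVE * 2, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, N_SURF * 3),   # (x, y, z) for each surface sample
)

def train_step(opt, curves_2d, surfaces_3d):
    """curves_2d: (B, N_CURVE*2); surfaces_3d: (B, N_SURF*3)."""
    opt.zero_grad()
    loss = nn.functional.mse_loss(mlp(curves_2d), surfaces_3d)
    loss.backward()
    opt.step()
    return loss.item()
```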

    Design of a General-Purpose MIMO Predictor with Neural Networks

    A new multi-step predictor for multiple-input, multiple-output (MIMO) systems is proposed. The output prediction of such a system is represented as a mapping from its historical data and future inputs to future outputs. A neural network is designed to learn the mapping without requiring a priori knowledge of the parameters and structure of the system. The major problem in developing such a predictor is how to train the neural network. With the back-propagation algorithm, the network is trained using the network's output error, which is not known because the predicted future system outputs are unknown. To overcome this problem, the concept of updating, instead of training, a neural network is introduced and verified with simulations. The predictor then uses only the system's historical data to update the configuration of the neural network and always works in a closed loop. If each node can only handle scalar operations, emulating a MIMO mapping requires an excessively large neural network, and it is difficult to encode known coupling effects of the predicted system. We therefore propose a vector-structured multilayer perceptron for the predictor design. MIMO linear, nonlinear, time-invariant, and time-varying systems were tested via simulation, and all showed very promising performance.
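
    A minimal sketch of the "updating instead of training" idea is given below: at every step the predictor is refreshed online using only already-observed history (the most recent outputs serve as targets for the inputs that preceded them), then used in closed loop to predict ahead. The window sizes, model, and optimizer are illustrative assumptions.

```python
# Minimal sketch: online-updated neural predictor for a MIMO system.
import torch
import torch.nn as nn

HIST, HORIZON, N_IN, N_OUT = 10, 5, 2, 2   # assumed window sizes and dims

net = nn.Sequential(
    nn.Linear(HIST * (N_IN + N_OUT) + HORIZON * N_IN, 64), nn.Tanh(),
    nn.Linear(64, HORIZON * N_OUT),
)
opt = torch.optim.SGD(net.parameters(), lr=1e-3)

def update_and_predict(history, future_inputs):
    """history: (T, N_IN+N_OUT) observed input/output pairs, T >= HIST+HORIZON;
    future_inputs: (HORIZON, N_IN) planned inputs."""
    # Update step: fit the latest observed outputs from the history that
    # preceded them -- only already-observed data, no future outputs needed.
    past = torch.cat([history[-(HIST + HORIZON):-HORIZON].flatten(),
                      history[-HORIZON:, :N_IN].flatten()])
    target = history[-HORIZON:, N_IN:].flatten()
    opt.zero_grad()
    nn.functional.mse_loss(net(past), target).backward()
    opt.step()
    # Closed-loop prediction with the freshly updated network.
    x = torch.cat([history[-HIST:].flatten(), future_inputs.flatten()])
    return net(x).view(HORIZON, N_OUT)
```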

    A Neural Networks Committee for the Contextual Bandit Problem

    This paper presents a new contextual bandit algorithm, NeuralBandit, which makes no stationarity assumption on contexts and rewards. Several neural networks are trained to model the value of rewards given the context. Two variants, based on a multi-expert approach, are proposed to choose the parameters of the multilayer perceptrons online. The proposed algorithms are successfully tested on a large dataset, with and without stationarity of rewards.
    Comment: 21st International Conference on Neural Information Processing
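
    A minimal sketch of the core idea is given below: one small network per arm predicts the expected reward for the current context, and only the played arm's network is updated with the observed reward. The architecture and the epsilon-greedy exploration are illustrative assumptions; the paper's variants instead select the perceptrons' parameters via a committee of experts.

```python
# Minimal sketch: per-arm reward networks for a contextual bandit.
import random
import torch
import torch.nn as nn

class NeuralBanditSketch:
    def __init__(self, n_arms, ctx_dim, eps=0.1):
        self.eps = eps
        self.nets = [nn.Sequential(nn.Linear(ctx_dim, 32), nn.ReLU(),
                                   nn.Linear(32, 1)) for _ in range(n_arms)]
        self.opts = [torch.optim.SGD(n.parameters(), lr=1e-2)
                     for n in self.nets]

    def choose(self, ctx):
        if random.random() < self.eps:           # explore
            return random.randrange(len(self.nets))
        with torch.no_grad():                    # exploit the best estimate
            preds = [net(ctx).item() for net in self.nets]
        return max(range(len(preds)), key=preds.__getitem__)

    def update(self, arm, ctx, reward):
        self.opts[arm].zero_grad()
        loss = ((self.nets[arm](ctx) - reward) ** 2).mean()
        loss.backward()
        self.opts[arm].step()
```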

    Recurrent Latent Variable Networks for Session-Based Recommendation

    In this work, we attempt to ameliorate the impact of data sparsity in the context of session-based recommendation. Specifically, we seek to devise a machine learning mechanism capable of extracting the subtle and complex underlying temporal dynamics in the observed session data, so as to inform the recommendation algorithm. To this end, we improve upon systems that utilize deep learning techniques with recurrently connected units by adopting concepts from the field of Bayesian statistics, namely variational inference. Our proposed approach treats the network's recurrent units as stochastic latent variables with a prior distribution imposed over them. On this basis, we proceed to infer the corresponding posteriors; these can be used for prediction and recommendation generation in a way that accounts for the uncertainty in the available sparse training data. To allow our approach to scale easily to large real-world datasets, we perform inference under an approximate amortized variational inference (AVI) setup, whereby the learned posteriors are parameterized via (conventional) neural networks. We perform an extensive experimental evaluation of our approach using challenging benchmark datasets, and illustrate its superiority over existing state-of-the-art techniques.
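
    A minimal sketch of the central construction follows: a recurrent cell whose hidden state is a stochastic latent variable, with a small network amortizing the posterior as a diagonal Gaussian, sampled via the reparameterization trick and regularized by a KL term. The sizes and the standard-normal prior are illustrative assumptions.

```python
# Minimal sketch: recurrent units as stochastic latent variables with an
# amortized Gaussian posterior q(h_t | h_{t-1}, x_t).
import torch
import torch.nn as nn

class StochasticRNNCell(nn.Module):
    def __init__(self, x_dim, h_dim):
        super().__init__()
        self.posterior = nn.Linear(x_dim + h_dim, 2 * h_dim)  # -> (mu, log var)

    def forward(self, x_t, h_prev):
        mu, logvar = self.posterior(torch.cat([x_t, h_prev], -1)).chunk(2, -1)
        h_t = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        # Closed-form KL(q || N(0, I)) for diagonal Gaussians.
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(-1)
        return h_t, kl
```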

    Explicit Computation of Input Weights in Extreme Learning Machines

    We present a closed-form expression for initializing the input weights in a multilayer perceptron, which can be used as the first step in synthesis of an Extreme Learning Machine (ELM). The expression is based on the standard function for a separating hyperplane as computed in multilayer perceptrons and linear Support Vector Machines; that is, as a linear combination of input data samples. In the absence of supervised training for the input weights, random linear combinations of training data samples are used to project the input data to a higher-dimensional hidden layer. The output weights are then solved in the standard ELM fashion, by computing the pseudoinverse of the hidden layer outputs and multiplying by the desired output values. All weights for this method can be computed in a single pass, and the resulting networks are more accurate and more consistent on some standard problems than regular ELM networks of the same size.
    Comment: In submission for the ELM 2014 Conference
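
    A minimal sketch of the scheme follows: input weights formed as random linear combinations of training samples, then the standard ELM pseudoinverse solve for the output weights. The hidden size and the sigmoid nonlinearity are illustrative assumptions.

```python
# Minimal ELM sketch with data-derived input weights.
import numpy as np

def elm_fit(X, Y, n_hidden=100, seed=0):
    """X: (n_samples, n_features); Y: (n_samples, n_outputs)."""
    rng = np.random.default_rng(seed)
    # Each hidden unit's input weight vector is a random linear
    # combination of training samples (cf. SVM-style hyperplanes).
    C = rng.standard_normal((n_hidden, X.shape[0]))
    W_in = C @ X                                  # (n_hidden, n_features)
    H = 1.0 / (1.0 + np.exp(-(X @ W_in.T)))       # hidden layer outputs
    W_out = np.linalg.pinv(H) @ Y                 # single-pass solve
    return W_in, W_out

def elm_predict(X, W_in, W_out):
    H = 1.0 / (1.0 + np.exp(-(X @ W_in.T)))
    return H @ W_out
```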

    Soft Computing Models for the Development of Commercial Conversational Agents

    Proceedings of: 6th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2011), Salamanca, April 6-8, 2011.
    In this paper we present a proposal for the development of conversational agents that, on the one hand, takes into account the benefits of using standards like VoiceXML, whilst on the other, includes a module with a soft computing model that avoids the effort of manually defining the dialog strategy. This module is trained using a labeled dialog corpus, and selects the next system response by means of a classification process based on neural networks that takes the dialog history into account. Thus, system developers only need to define a set of VoiceXML files, each including a system prompt and the associated grammar to recognize the users' responses to the prompt. We have applied this technique to develop a conversational agent in VoiceXML that provides railway information in Spanish.
    Funded by projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485), and DPS2008-07029-C02-02.
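
    A minimal sketch of the dialog-management module the abstract describes follows: a neural classifier that maps an encoded dialog history to the next system response, i.e. to one of the predefined VoiceXML prompt files. The history encoding, file names, and layer sizes are hypothetical.

```python
# Minimal sketch: neural classifier choosing the next VoiceXML prompt.
import torch
import torch.nn as nn

PROMPTS = ["welcome.vxml", "ask_destination.vxml", "ask_date.vxml",
           "confirm.vxml", "give_timetable.vxml"]   # hypothetical prompt files
HIST_DIM = 64    # assumed size of the encoded dialog history

classifier = nn.Sequential(
    nn.Linear(HIST_DIM, 128), nn.ReLU(),
    nn.Linear(128, len(PROMPTS)),
)

def next_prompt(history_vec: torch.Tensor) -> str:
    """Pick the VoiceXML file to play next, given the dialog history."""
    with torch.no_grad():
        return PROMPTS[classifier(history_vec).argmax().item()]
```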